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Abstract 

Among the various procedures used to detect potential changes in a stochastic process the moving 
sum algorithms are very popular due to their intuitive appeal and good statistical performance. 
One of the important design parameters of a change detection algorithm is the expected interval 
between false positives, also known as the average run length (ARL). Computation of the ARL 
usually involves numerical procedures but in some cases it can be approximated using a series 
involving multivariate probabilities. In this paper, we present an analysis of this series approach 
by providing sufficient conditions for convergence and derive an error bound. Using simulation 
studies, we show that the series approach is applicable to moving average and filtered derivative 
algorithms. For moving average algorithms, we compare our results with previously known 
bounds. We use two special cases to illustrate our observations. 

Key words: Average Run Length, Change Detection, Moving Average, Filtered Derivative, 
Control Charts 



1. Introduction 



The problem of detecting a change in the mean value of a process, when both the change 
point and change magnitude are unknown, is of great importance in various disciplines such as 
econometrics, engineering, quality control, technical analysis of financial data, and edge detection 
in image processing. The optimal scheme, which involves Maximum-Likelihood estimation of 
both the change point and the change magnitude, is computationally prohibitive. Hence various 
simple but suboptimal methods like ordinary moving average (MA), exponentially weighted 
moving average (EWMA), and filtered derivative (FD) are used in practice. For example, MA is 



used in technical analysis of financial data, like stock prices, returns or trading volumes (Murphy 
p99| . 

To apply the MA scheme a finite, (say k), immediate past samples are added with equal 
weights l/k, while in EWMA, the past samples are combined with exponentially decreasing 
weights. A generalization of the MA scheme is where the past k samples are combined with 
arbitrary positive weights (see (Lai 1974 Bohm and Hackl 1990 1) . In other applications, e.g., 



filtered derivative in edge detection (Basseville 19811, the past k samples are combined with 



both positive and negative weights. In this paper, such generalizations are referred to as moving 
sums (MOSUM). In this paper, we study the most commonly used MOSUM algorithms - namely 
the MA and FD schemes. 

Change detection schemes are assessed on the basis of the statistical distribution of the run 
length, i.e., the number of test samples taken before a false positive is detected. For most practical 
purposes, the distribution function of run length is adequately summarized by its expected (or 
average) value, also known as the average run length (ARL). Unfortunately, for most practical 
schemes closed form expression for ARL is difficult to obtain. For the MA scheme, the bulk 
of the research so f ar has been dedicated to either tab ulating numerical results through Monte 
Carlo simulations flSAS/QC® 9.1 User's Guide] |2004| ) or deriving bounds using multivariate 



♦Corresponding author. Tel: (315) 751-1370 Fax: (315) 443-4745 
Email addresses: swkar@syr.edu (Swarnendu Kar), mehrotraSsyr.edu (Kishan G. Mehrotra), 
varshneyOsyr.edu (Pramod K. Varshney) 



August 20, 2009 



probability distribution functions (MPDFs) (Bohm and Hackl 19901. Very little work has been 
done to date regarding the ARL of FD scheme. 

The ARL can be written as the sum of an infinite number of MPDFs of increasingly higher 
dimensions. But MPDFs, in general, can be computed only numerically and the computational 
intensity increases with the dimension of the multivariate vector. While addressing the problem 
of approximating the ARL for EWMA algorithms, it was observed by Robinson and Ho (19781 
that the ratio of the successive MPDFs converge as the dimension increases. This fact was used 
to propose a series based approach of approximating the ARL. For the MOSUM algorithm with 
positive weights and by using only fc th order MPDFs, upper and lower bounds were proposed by 
Lai ( 1974 1 and improved by Bohm and Hackl ( 1990|. This was based on the idea that an MPDF 



of dimension larger than k can be bound (both above and below) by products of lower order 
MPDFs. Consistent with the name of the authors, we refer to those results as LBH bounds. 

In this paper, we analyze the series approach of Robinson and Ho (19781 by laying out 
sufficient conditions for convergence and also provide an error bound. With simulation stud- 
ies, we demonstrate the versatility of this approach by showing that the conditional survival 
probabilities also converge for MA and FD algorithms. For MA algorithms, we compare the 
actual convergence vis-a-vis the LBH bounds. Through simulation studies for both MA and FD 
schemes, we demonstrate that a satisfactory approximation of the ARL can be obtained by using 
only |~fc/2] th order MPDFs. Compared to fc th order MPDFs as required by the LBH bounds, 
this provides a significant saving in computation for the MA scheme. 

The rest of this paper is organized as follows. We introduce some notations in Section [2] 
In Section [3] we summarize the main results of this paper. We examine the convergence of 
conditional survival probabilities for MOSUM algorithms in Section [4j In Section [5] we compare 
the convergence of ARL for MA algorithms with the LBH bounds. We also demonstrate that 
MPDFs of |~/c/2] th order provides a reasonable approximation of the ARL. Concluding remarks 
are provided in Section [6] 



2. Notations 



Let X i , X2 , . . . X ri 



be a sequence of observations obtained from a discrete time random 



process. We assume that X^s are independently distributed random variables with variance 
a 2 . It is assumed that the mean of Xi's possibly changes from /1 to 9 at some point, here 
cr 2 , /1 and 9 > /1 are unknown parameters and possible time of change is also unknown. For 
detecting a change in the mean value of X^s from we need to formulate an appropriate linear 
test statistic, say Y m = YnLm-k+i c m-iXi, where Ci are constants and compare it against some 
threshold. Roberts (19591 has considered both the MA and EWMA schemes for appropriately 



chosen weights. A generalization of MA was considered by Bohm and Hackl ( 1990 1, where c, > 
for all i's. There are other applications where all the weights need not be positive. For example, 



in the context of edge detection, Basseville ( 1981 1 uses the following test statistic 



Y 



m-k/2 

E 

i—m—k-\-l 



x, 



E 

i=m-k/2+l 



X, 



where k is assumed to be an even number. This is also known as the filtered derivative (FD) 
algorithm, since we take the difference of averaged (filtered) block of samples. 

To test whether the process mean is or has shifted to 9, the test statistic Y m is monitored 
for successive values of m and compared against an upper threshold, say h. The threshold is 
sometimes also specified as multiples of the standard deviation in excess of the mean of the test 
statistic, i.e. by the quantity 6 defined by 

. h - E(Y ro ) 



VVar(y m ) ' 



In this paper, MOSUM ([cq, a, . . 
and threshold 5. 



, Cfc_i], S) denotes a moving sum algorithm with weights cq,c±, . . . , Ck-i 
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The time elapsed before Y m exceeds the thresholds for the first time is also known as the run 
length (RL) or stopping time. 

RL = max{m : Y m < h} 

In this paper, we are interested in the average run length (ARL), namely the expectation of RL, 
i.e., in 

L = E(RL) 

To show that L can be represented as an infinite sum of probability distributions we define: 

• p n , the probability that RL= n, i.e., the test passes n — 1 consecutive times but fails at 
the n th instant; 

p n = P(Y k+1 ,Y k+2 , . . .,Yk+n-i < h,Y k+n > h). 

• q n , the survival probability is the probability that the test passes n consecutive times 

q„ = P(Y k+1 ,Y k+2 , . . . , Y k+n < h) 

• r n , the conditional survival probability that the test survives at a particular instant given 
that it has already survived the past n — 1 times, 

r n = P(Y k + n < h\Y k +i, Y k +2, • • • , Yk+n— 1 

< h) 

It follows from these definitions that p n = q n -i — q n and r n = q n /q n -i- Since the evaluations 
start at index k, the ARL function can be represented as 

oo 

L = k — 1 + np n 

71=1 

OO 

= fc+^q„ (1) 

n=l 

In this paper, we use ([T]) to cither derive closed form expressions or provide approximate results. 
3. Main results 

In Theorem [l] we obtain closed form expressions for L for two special cases. 

Theorem 1 (Two special cases). Let X\, X 2 , ■ ■ ■ X n , . . . be zero-mean i.i.d. random variables 
with symmetric pdf. Then the following results apply, 

1. For MOSUM([-l, 1},0), L = exp(l) w 2.7183. 

2. For MOSUM([1,1],0), L = sec(l) + tan(l) w 3.4082. 

Proof. 1. For Co = 1 and c\ — —1, the q n is given by 

q„ = P(X 1 -X 2 <0,X 2 -X 3 <0,...,X n - X n+X < 0) , 

= P{X 1 <X 2 < X 3 --- <X n+1 ). (2) 

We recall that X\,X 2 , . . . , X n+ i are i.i.d. random variables. If we draw n + 1 independent 
samples from the same distribution and order them, they can result in any one of the 
(n+1)! possible orderings with equal probability. Since Q denotes only one such ordering, 
we conclude that 

1 

q n = 



(n+1)! 



This result is well known for the Gaussian random variables ( Barlow, Barthoromew, Brem- 
ner, and Brunk 1972). Using (JTJ, we obtain 
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2. Next we consider cq = l,ci = 1. The MPDF q n was derived in the context of Gaussian 



random variables in (Moran, 19831, but a careful analysis of the proof reveals that the 
argument is valid for any symmetric distribution. Thus, quoting (Moran 19831, q n is 
given by the coefficient of z n+1 in the power series expansion of sec(z) 
point z = 0. Since, 



tan(c) around the 



sec(z) + tan(z) = 1 + z + ^ q n z n+1 



n=l 



Evaluating the above series at z = 1 and using ([l]), we obtain 

L = scc(l) + tan(l). 



□ 



These two special cases may not be of any practical use, nevertheless, we use these cases as 
illustrative examples in our discussions in Section [4] 

3.1. Previous work on approximating ARL using MPDFs 

We observe that q n is an MPDF of n-dimensions involving correlated variables Yfc+i, ifc+2, • • • , Yk+n- 
In general, closed form expressions for q n are not available; the values are computed numerically. 
The intensity of these computations increases with n. As a result, the summation in the form of 
can seldom be used to compute the ARL. Various methods for approximating the ARL have 
been proposed by the researchers; we briefly describe two such approaches due to Lai ( 1974 1 and 



Robinson and Ho (19781 below 



For EWMA, and using simulations, Robinson and Ho (1978) observed that as n increases the 
conditional survival probabilities {r n } appear to converge. If we assume that ~ r n ,V« > n, 
then the future survival probabilities can be approximated as g, ~ g,_ir n ,Vi > n. By using (fill, 
we obtain the n th order approximation of L as: 



L n = k + ^2 1i + + r n + r n + ...) 

i=l 
n-1 



(3) 



Thus, in the ARL is approximated by using only a few lower order MPDFs. Another 



significant result due to Lai (19741 and improved by Bohm and Hackl (19901 provides an upper 



and a lower bound on ARL for MO SUM algorithms with positive weights, as follows: 



qk 
Pk 



< L<k 



Ok 
Pk' 



(4) 



We denote the lower and upper bounds (together we call them LBH bounds) in Q by L[ and L u 
respectively. These bounds are significant because they are asymptotically the same, i.e., when 
(L) is large compared to the span k, the upper and lower bounds are almost equal. The ARL 
can then be approximated as L « qk/pk, which requires the computation of MPDFs of order k. 

Very little work can be found in the literature, if any, regarding the approximation of ARL 
for the FD scheme. 



3.2. Convergence analysis of {r n } and {!„} 

In Theorem [2] we present an analysis of the series in |3]). In particular, we derive an error 
bound that relates the convergence of L n to that of r n . 

Theorem 2 (Convergence and error bound). Assume L < 00, r n — > r for some r € [0, 1). Then 
1. L n -» L. 
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2. Choose e such that < e < 1 — r. If \r n — r\ < e for all n > m, then 



L n -L 

L 



2e / 7, ^ 

< — for all n > m 

(1 — r — ey 



and if, in particular, convergence {r n }m+i * s rnonotonic, then 

e 



L n - L < 



(1 - r- e) ; 



for all n> m 



(5) 



(6) 



Proof. 1. If L < oo, then from equation ([I]) it follows that q n —> 0. Taking limits on the right 
hand side of pj, we complete the first result. That is: 



urn iy„ = LI = L. 

n — >oo 1 — T 

2. Choose e > 0. Since {r n } — > r, there is an index m such that 

|r,- — r\ < e, Vi > TO. 



(7) 



To obtain the desired bounds on L we first note that L — linin^oo L n and write this 
limiting value as a telescopic sum to obtain: 



L — L m + (L m _|_i — L m ) + (L m+ 2 — L m +i) + 

oo 

= L m + (Li — 

For i > 2, we use ([3]) to obtain 

T T Qi Qi-i 
Li — L*i-\ — qi-i + 

i - n i - n-i 

qi-i qi-i 



(8) 



l-n i - n-i 
n - n-i 



£?»-!- 



•(l-rOa-fi-x) 
Using ([9| in (JsJ , we can bound the approximation error as follows 



(9) 



(a) 
< 



oo 



E 



1", - Tj-i 

(l-r<)(l-ri_i) 
In - n-ij 

(l-noa-n-i) 



(b) 

< 



(1-r-e) 2 



(10) 



i—m+l 



where (a) follows since qi, 1 — r% > 0, Vi, and (b) is due to our initial assumption that 
1 — r — e > and from ([7]), we can obtain that 1 — r — e < 1 — r$, Vi > to. 
From (10 1, we can proceed to obtain (p>| as follows 



iin — i 



(a) 1 

< — 



2e 



(b) 
< 



2e 



(1-r-e) 5 



where (a) follows from the fact that for i > to + 1, |r< — r»_i| < |r» — r| + |r*i_i — r\ < 2e 
and (b) follows from the infinite sum in ([lj. 
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From (10 1, assuming {?"n}m+i to be monotonic, we can proceed to obtain ^ as follows 

oo 

l r i _r *-ll 

=m+l 

oo 



(a) 




1 




(1 

V 1 


— T — 


e) 2 


(b) 




l 






(1 


— r — 




(o) 




1 






(1 


— r — 


ef 


(d) 
< 




e 




(1 


— r — 


6)2 



i—m+l 



(a) is due to fact that qt < 1 Vi, (b) follows from the monotonicity of {r n }m+ii ( c ) follows 
from an argument exactly similar to the one used to derive (|8| and (d) is due to assumption 
0. 



□ 



Applicability of Theorem [2] relics on the convergence of {r n }. We discuss the convergence of 
{r n } for some moving sum algorithms in the following section. 



4. Convergence of {r n } for Moving Sums 

We use simulation studies to examine the convergence properties of r n for MA and FD 
algorithms. In all our simulations, it was assumed that the error variables Xi are normally 
distributed. The MPDFs were calculated as multivariate normal CDFs using the technique in 
We have used the software available from the author's website 
We fix the standardized thresholds at 6 



(Genz 1992) 



We evaluate r n 
and 5 = 2. The 



for various values of the span k. 
graphs are displayed in Figure [T] 

In Figure 1(a) we observe that r n achieves its limiting value very quickly after some initial 
zig-zag pattern. This limiting value, r, depends on k and it is observed that the length of zig- 
zagedness depends on the value of k. This behavior is more pronounced for 5 = 0. For the 
case of filtered derivatives r n (see Figures 1(c) and |l(d)" l monotonically decrease after initial 
minor deviations. Consequently, for the filtered derivative model the tighter bound, given by 
equation ([6]), can be applied. The monotonic convergence (r n — > 0) follows for the special case 
of MOSUM([— 1, 1], 0) from Theorem [T] where we had observed that the survival probability 
q n = ( n ^!y and therefore the conditional survival probability is r n = ^q-j- which converges to 



zero. 



The zig-zag property, which is most pronounced in Figure l(a)| is easy to explain for the 
special case of MOSUM([l, 1], 0). From Theorem [l] we know that q n is the coefficient of z n 
in the power series expansion of sec(z) + tan(z) around z = 0. Thus, we obtain the survival 
probabilities as: 



{q n } 



1 1 5 2 61 
2' 3' 24' 15' 720' 



The conditional survival probabilities can therefore be computed as {r„} = {§, §, 55, ' ' " } 
{0.667, 0.625, 0.640, 0.635, • • ■ }. It is a known result (|Sloane[ |2009[) that r n converges to 2/vr 



0.6366. Also {r n } decreases and increases alternately, giving the series a zig-zag appearance of 
period 2. For moving averages of span k, we observe that the zig-zag pattern persists with period 
k. 

Based on the observations in Figure[l] several conjectures can be made about the convergence 
of the conditional survival probabilities {r n } in MOSUM algorithms. 

• The sequence {r n } converges to some r as n incresaes, 

• the limit point r is an increasing function of both the k and the threshold 5, 



• the convergence to r is faster for higher thresholds. 
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(a) Moving average algorithm with low threshold 
(5 = 0) 



(b) Moving average algorithm with high threshold 
(5 = 2) 





Index (n) 

(c) Filtered derivative algorithm with low threshold 
(5 = 0) 
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(d) Filtered derivative algorithm with high thresh- 
old (S = 2) 



Figure 1: Convergence of conditional survival probability (r n ) for one-sided moving average and filtered derivative 
algorithms considering low (<5 = 0) and high (<5 = 2) thresholds. 
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Index (n) 

Figure 2: Convergence of {L„} and comparison with L;, L u for span k = 8 and threshold given by q\ = 0.95 



It was shown in Theorem [2] that convergence of {r n } is a sufficient condition for the applicability 
of the series approach of approximating the ARL. In the next section, we apply the series 
approach to MA and FD algorithms. 



5. Comparison with LBH bounds and approximation of ARL 

First we consider moving average algorithms and compare the convergence of the series 
approach (JlJ with the LBH bounds given by Q. Since the filtered derivative algorithms have 
negative weights, the LBH bounds are not applicable. 

A representative example, demonstrating the convergence of {L n }, is shown is Figure|2]for k = 
8. We considered the threshold h such that that qi = P(Y% < h) = 0.95. We have also plotted the 
ARL obtained via Monte-Carlo simulations. We observe that the series enters the region bounded 
by Li and L u fairly rapidly. Though the LBH bounds could only be calculated using MPDFs of 
8 th order, reasonable approximation can be obtained using lower order MPDFs. In Figure [3] we 
demonstrate this fact for different thresholds. Since the ARL varies with threshold, we compared 
them with their respective LBH bounds. We plot the ARL in excess of the corresponding lower 
LBH bound, i.e., we plot the quantity L u — hi. We note that L u — Li = k — 1. The [0, k— l]-lines 
(dotted) represent hi and h u respectively. 

We tabulate the ARL for moving average and filtered derivative algorithms in Table [l] and 
compare them with approximate values obtained using the series approach in ([3]). We show the 
approximation L[fc/2] ; which uses only [/c/2] th order MPDFs. For moving average algorithm with 



span k < 10, the reference values of h were obtained from |SAS/QC® 9.1 User's Guide] (|2004| 



All other reference values were obtained using Monte-carlo simulations. We have considered 
only one-sided tests and the ARLs are tabulated in Table [l] We conclude that L[fc/2] provides 
a reasonable approximation of h. 



6. Conclusion 

In this paper, we have considered the approximation of ARL for moving sum algorithms 
with arbitrary weights using multivariate probabilities. Specifically, we have considered moving 
average and filtered derivative algorithms. We have applied a series approach that was originally 
proposed by |Robinson and Ho ( 1978[ ) for geometric moving average algorithms. We have pre- 
sented an analysis of the convergence of the series. We have shown using simulation studies that 
multivariate probabilities of order \k/2] can provide reasonable approximations of the ARL. We 
have also derived the ARL for two special cases of MOSUM algorithms, and have used them as 
illustrative examples. 
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Figure 3: Convergence of {L n } and comparison with L;,L U for span k = 8 and various thresholds 
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(a) ARL for one-sided moving averages 
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(b) ARL for one-sided filtered derivatives 
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Table 1: Comparison of the ARL with its series approximation. For both moving average and filtered derivative 
algorithms of span k, the |"fc/2] th order approximation is reasonably accurate. 
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